Abstract

This paper deals with probabilistic upper bounds for the error in functional estimation defined on some interpolation and extrapolation designs, when the function to estimate is supposed to be analytic. The error pertaining to the estimate may depend on various factors: the frequency of observations on the knots, the position and number of the knots, and also the error committed when approximating the function through its Taylor expansion. When the number of observations is fixed, all these parameters are determined by the choice of the design and by the choice of the estimator of the unknown function.

1 Introduction



Consider a function $\varphi$ defined on some open set $D \subset \mathbb{R}$ which can be observed on a compact subset $S$ included in $D$. The problem that we consider is the estimation of this function through some interpolation or extrapolation techniques. This amounts to defining a finite set of points $s_i$ in a domain $\widetilde{S}$ included in $S$ and the number of measurements of the function $\varphi$ at each of these points, that is, to defining a design $V := \{(s_i, n_i) \in S \times \mathbb{N},\ i = 0, \dots, l,\ \widetilde{S} \subset S\}$. The points $s_i$ are called the knots, $n_i$ is the frequency of observations at knot $s_i$ and $l + 1$ is the number of knots. The choice of the design $V$ is based on some optimality criterion. For example, we could choose an observation scheme that minimizes the variance of the estimator of $\varphi$.

The choice of $V$ has been investigated by many authors. Hoel and Levine, and Hoel ([8] and [9]), considered the case of the extrapolation of a polynomial function with known degree in one and two variables. Spruill, in a number of papers (see [12], [13], [14] and [15]), proposed a technique for the (interpolation and extrapolation) estimation of a function and its derivatives, when the function is supposed to belong to a Sobolev space. Celant (in [4] and [5]) considered the extrapolation of quasi-analytic functions, and Broniatowski and Celant in [3] studied optimal designs for analytic functions through some control of the bias.

The main defect of any interpolation and extrapolation scheme is its extreme sensitivity to the uncertainties pertaining to the values of $\varphi$ on the knots. The larger the number $l + 1$ of knots, the more unstable the estimate. In fact, even when the function $\varphi$ is accurately estimated on the knots, the estimates of $\varphi$ or of one of its derivatives $\varphi^{(j)}$ at some point in $D$ may be quite unsatisfactory, due either to a wrong choice of the number of knots or to their location. The only case when the error committed while estimating the values $\varphi(s_i)$ is not amplified in the interpolation procedure is the linear case. Therefore, for any more involved case the choice of $l$ and $(s_j, n_j)$ must be handled carefully, which explains the wide literature devoted to this subject. For example, if we estimate $\varphi(v)$, $v \in S \setminus \widetilde{S}$, by $\widehat{\varphi}(s_k) := \varphi(s_k) + \varepsilon(k)$, where $\varepsilon(k)$ denotes the estimation error and $\widetilde{S}$ is a Tchebycheff set of points in $S$, we obtain

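$$\left|\varphi(v) - \widehat{\varphi}(s_k)\right| \le \left(\max_{k}|\varepsilon(k)|\right)\Lambda_l(v, s_k, 0),$$

where $\Lambda_l(v, s_k, j)$ is a function that depends on $\widetilde{S}$, on the number of knots and on the order of the derivative that we aim to estimate (here 0), and (see [?] and [?])

$$\max_{k=0,\dots,l}\Lambda_l(v, s_k, 0) := \frac{1}{l+1}\sum_{k=0}^{l}\operatorname{ctg}\left(\frac{2k+1}{4(l+1)}\pi\right) \sim \frac{2}{\pi}\ln(1 + l) \quad \text{when } l \to \infty.$$

If equidistant knots are used, one gets (see [11])

$$\max_{k=0,\dots,l}\Lambda_l(v, s_k, 0) \sim \frac{2^{l+1}}{e\,l\,(\ln l + \gamma)}, \qquad \gamma = 0.577 \ \text{(Euler-Mascheroni constant)}.$$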
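The contrast between the two growth rates can be checked numerically. The sketch below is only an illustration under stated assumptions: it works on $[-1, 1]$, indexes the Tchebycheff knots by $j = 1, \dots, l+1$, evaluates the Lebesgue function $\sum_k |L_{s_k}(v)|$ on a fine grid, and compares the Tchebycheff value with the cotangent sum written with $2k+1$, $k = 0, \dots, l$. The function name `lebesgue_constant` and the grid size are hypothetical choices, not part of the paper.

```python
import numpy as np

def lebesgue_constant(knots, grid):
    """Maximum over the grid of sum_k |L_{s_k}(v)| for the given knots."""
    knots = np.asarray(knots, dtype=float)
    total = np.zeros_like(grid)
    for k, s_k in enumerate(knots):
        others = np.delete(knots, k)
        # Elementary Lagrange polynomial L_{s_k} evaluated on the whole grid.
        basis = np.prod((grid[:, None] - others) / (s_k - others), axis=1)
        total += np.abs(basis)
    return total.max()

l = 10                                    # degree, hence l + 1 knots
grid = np.linspace(-1.0, 1.0, 20001)      # includes both endpoints
j = np.arange(1, l + 2)                   # shifted index (assumption)
chebyshev = np.cos((2 * j - 1) * np.pi / (2 * l + 2))
equidistant = np.linspace(-1.0, 1.0, l + 1)

leb_cheb = lebesgue_constant(chebyshev, grid)
leb_equi = lebesgue_constant(equidistant, grid)

# Cotangent sum for the Tchebycheff knots and the two asymptotic expressions.
k = np.arange(0, l + 1)
closed_form = np.sum(1.0 / np.tan((2 * k + 1) * np.pi / (4 * (l + 1)))) / (l + 1)
cheb_asymptotic = 2.0 / np.pi * np.log(1 + l)
equi_asymptotic = 2.0 ** (l + 1) / (np.e * l * (np.log(l) + 0.577))

print(leb_cheb, closed_form, cheb_asymptotic)
print(leb_equi, equi_asymptotic)
```

For moderate $l$ the asymptotic expressions give the order of magnitude only, so the printed values are expected to be comparable rather than equal.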

When the bias in the interpolation is zero, as in the case when $\varphi$ is polynomial with known degree, the design is optimized with respect to the variance of the interpolated value (see [8]). In the other cases the criterion that is employed is the minimal MSE criterion. The minimal MSE criterion allows the estimator to be as accurate as possible, but it does not yield any information on the interpolation/extrapolation error.

In this paper, we propose a probabilistic tool (based on the concentration of measure) in order to control the estimation error. In Section 2 we present the model, the design and the estimators. Section 3 deals with upper bounds for the error. Concluding remarks are given in Section 4.

2 The model, the design and the estimators 

Consider an unknown real-valued analytic function $f$ defined on some interval $D$:

$$f : D := (a, b) \to \mathbb{R}, \qquad v \mapsto f(v).$$

We assume that this function is observable on a compact subset $S$ included in $D$, $S := [\underline{s}, \overline{s}] \subset D$, and that its derivatives are not observable at any point of $D$. Let $\widetilde{S} := \{s_k \in S,\ k = 0, \dots, l\}$ be a finite subset of $l + 1$ elements in the set $S$. The points $s_k$ are called the knots.

Observations $Y_i$, $i = 1, \dots, n$, are generated from the following location-scale model

$$Y_j(s_k) = f(s_k) + \sigma E(Z) + \varepsilon_j,$$

$$\varepsilon_j := \sigma Z_j - \sigma E(Z_j), \qquad j = 1, \dots, n_k,\ k = 0, \dots, l,$$

where $Z$ is a completely specified continuous random variable, and the location parameter $f(v)$ and the scale parameter $\sigma > 0$ are unknown. $E(Z)$ and $\varsigma$ respectively denote the mean and the variance of $Z$, and $n_k$ is the frequency of observations at knot $s_k$.

We assume that we observe $(l + 1)$ independent samples, $Y(k) := (Y_1(s_k), \dots, Y_{n_k}(s_k))$, $k = 0, \dots, l$, the observations within each sample being i.i.d. copies of $Y_1(s_k)$.

The aim is to estimate a derivative $f^{(d)}(v)$, $d \in \mathbb{N}$, of $f$ at a point $v \in (a, \underline{s})$.

Let $\varphi(v) := f(v) + \sigma E(Z)$, and consider the Lagrange polynomial

$$L_{s_k}(v) := \prod_{j \ne k,\ j = 0}^{l} \frac{v - s_j}{s_k - s_j}.$$
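As an illustration, the elementary Lagrange polynomial $L_{s_k}(v) = \prod_{j \ne k} (v - s_j)/(s_k - s_j)$ can be evaluated directly from the product formula. The following is a minimal sketch; the function name `lagrange_basis` and the three knots are hypothetical choices made for the example.

```python
import numpy as np

def lagrange_basis(knots, k, v):
    """Evaluate L_{s_k}(v) = prod_{j != k} (v - s_j) / (s_k - s_j)."""
    knots = np.asarray(knots, dtype=float)
    others = np.delete(knots, k)          # all knots except s_k
    return float(np.prod((v - others) / (knots[k] - others)))

knots = [0.0, 0.5, 1.0]

# L_{s_k}(s_j) equals 1 when j = k and 0 otherwise.
values_at_knots = [[lagrange_basis(knots, k, s) for s in knots] for k in range(3)]

# The elementary polynomials sum to 1 at any point (here v = 0.3).
partition = sum(lagrange_basis(knots, k, 0.3) for k in range(3))
```

The two properties checked here (the Kronecker property at the knots and the partition of unity) are what make $\sum_k \varphi(s_k) L_{s_k}(v)$ an interpolant of $\varphi$.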



We are interested in interpolating (or extrapolating) some derivatives of $\varphi$, $\varphi^{(d)}$, with $d \in \mathbb{N}$,

$$L_l(\varphi^{(d)})(v) := \sum_{k=0}^{l} \varphi(s_k)\, L_{s_k}^{(d)}(v).$$

The domain of extrapolation is denoted $U := D \setminus S$. It is convenient to define a generic point $v \in D$, stating that it is an observed point if it is a knot, an interpolation point if $v \in S$ and an extrapolation point if $v \in U$. For all $d \in \mathbb{N}$, for any $v \in S$, the Lagrange interpolation scheme converges for the function $\varphi^{(d)}$, that is, for $l \to \infty$,

$$L_l(\varphi^{(d)})(s) \to \varphi^{(d)}(s), \qquad \forall s \in S.$$

Interpolating the derivative $\varphi^{(d+i)}(s^*)$ at a suitably chosen point $s^* \in S$, a Taylor expansion with order $(m - 1)$ of $\varphi^{(d)}(v)$ at point $v$ from $s^*$ gives

$$T_{\varphi^{(d)},m,l}(v) := \sum_{i=0}^{m-1} \frac{(v - s^*)^i}{i!}\, L_l(\varphi^{(d+i)})(s^*),$$

and we have

$$\lim_{m \to \infty}\ \lim_{l \to \infty} T_{\varphi^{(d)},m,l}(v) = \varphi^{(d)}(v), \qquad \forall v \in D.$$

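When $\varphi \in C^{\alpha}(D)$, $\forall \alpha$, $l \ge 2\alpha - 3$, the upper bound for the error of approximation is given in [?],

$$E_t := \sup_{v \in D}\left|\varphi^{(d)}(v) - T_{\varphi^{(d)},m,l}(v)\right| \le M(m, l, \alpha),$$

where $M(m, l, \alpha) = A(\alpha, l) + B(m)$,

$$A(\alpha, l) := K(\alpha, l) \sum_{i=0}^{m-1} \sup_{s \in S}\left|\varphi^{(d+i+\alpha)}(s)\right| \frac{\sup_{v}|v - s^*|^{i}}{i!},$$

$$K(\alpha, l) := \left(\frac{\pi}{2(1 + l)}(\overline{s} - \underline{s})\right)^{\alpha}\left(9 + \frac{4}{\pi}\ln(1 + l)\right),$$

and

$$B(m) := \sup_{v \in (a, \underline{s})} \frac{|v - s^*|^{m}}{m!}\left|\varphi^{(d+m)}(v)\right|.$$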

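The optimal design writes $\{(n_k, s_k) \in (\mathbb{N} \setminus \{0\})^{l+1} \times \mathbb{R}^{l+1},\ n := \sum_{k=0}^{l} n_k,\ n \text{ fixed}\}$, where $n$ is the total number of experiments and the $(l + 1)$ knots are defined by

$$s_k := \frac{\underline{s} + \overline{s}}{2} + \frac{\overline{s} - \underline{s}}{2}\cos\left(\frac{2k - 1}{2l + 2}\pi\right), \qquad k = 0, \dots, l,$$

with

$$n_k := \left[\frac{n\sqrt{P_k}}{\sum_{k=0}^{l}\sqrt{P_k}}\right],$$

$[\cdot]$ denoting the integer part function, and (see [3] for details)

$$P_k := \sum_{\beta=0}^{m}\sum_{\alpha=0}^{m} \frac{(u - s^*)^{\alpha+\beta}}{\alpha!\,\beta!}\, L_{s_k}^{(\alpha+d)}(s^*)\, L_{s_k}^{(\beta+d)}(s^*), \qquad k = 0, \dots, l.$$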
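The construction of the design can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the knot index is shifted to $k = 1, \dots, l+1$ so that the $l + 1$ Tchebycheff knots are distinct, and the weights $P_k$ are passed in as given positive numbers (the values used below are purely hypothetical). The function names `chebyshev_knots` and `allocate` are introduced for the example only.

```python
import math

def chebyshev_knots(s_low, s_high, l):
    """Tchebycheff knots on [s_low, s_high]; index shifted to k = 1..l+1 (assumption)."""
    mid = (s_low + s_high) / 2.0
    half = (s_high - s_low) / 2.0
    return [mid + half * math.cos((2 * k - 1) * math.pi / (2 * l + 2))
            for k in range(1, l + 2)]

def allocate(n, weights):
    """Frequencies n_k = [n * sqrt(P_k) / sum_k sqrt(P_k)], with [.] the integer part."""
    roots = [math.sqrt(p) for p in weights]
    total = sum(roots)
    return [math.floor(n * r / total) for r in roots]

knots = chebyshev_knots(0.0, 1.0, 3)            # l = 3, hence 4 knots
freqs = allocate(100, [1.0, 4.0, 9.0, 16.0])    # hypothetical weights P_k
print(knots)
print(freqs)
```

Because of the integer part, the frequencies may sum to slightly less than $n$ in general; how the remaining observations are distributed is left open here.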



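The function $\varphi$ cannot be observed exactly at the knots. Let $\widehat{\varphi}(s_k)$ denote the least squares estimate of $\varphi(s_k)$ at the knot $s_k$ and

$$L_l(\widehat{\varphi}^{(d)})(v) := \sum_{k=0}^{l} \widehat{\varphi}(s_k)\, L_{s_k}^{(d)}(v). \qquad (1)$$

We estimate the $d$-th derivative of $\varphi(v)$ at $v \in D$ as follows

$$\widehat{T}_{\varphi^{(d)},m,l}(v) := \sum_{i=0}^{m-1} \frac{(v - s^*)^i}{i!}\, L_l(\widehat{\varphi}^{(d+i)})(s^*), \qquad s^* \in S.$$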
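The estimator combines two steps: derivatives of the Lagrange interpolant are evaluated at $s^*$, then a Taylor sum of order $m - 1$ carries them to the target point. The sketch below assumes that the estimator reads $\sum_{i=0}^{m-1} \frac{(v - s^*)^i}{i!} \sum_{k=0}^{l} \widehat{\varphi}(s_k) L_{s_k}^{(d+i)}(s^*)$; the function names and the numerical values are hypothetical and chosen so that the result can be checked by hand.

```python
import math
from numpy.polynomial import Polynomial

def lagrange_polynomials(knots):
    """Elementary Lagrange polynomials L_{s_k} as numpy Polynomial objects."""
    polys = []
    for k, s_k in enumerate(knots):
        p = Polynomial([1.0])
        for j, s_j in enumerate(knots):
            if j != k:
                p = p * Polynomial([-s_j, 1.0]) / (s_k - s_j)
        polys.append(p)
    return polys

def deriv_at(p, order, x):
    """Value at x of the derivative of the given order (order 0 is p itself)."""
    return p.deriv(order)(x) if order > 0 else p(x)

def taylor_lagrange_estimate(knots, phi_hat, s_star, v, d=0, m=3):
    """Taylor sum of order m - 1 from s_star of the interpolated derivatives."""
    polys = lagrange_polynomials(knots)
    total = 0.0
    for i in range(m):
        # Interpolated (d + i)-th derivative at s_star.
        interp = sum(y * deriv_at(p, d + i, s_star) for y, p in zip(phi_hat, polys))
        total += (v - s_star) ** i / math.factorial(i) * interp
    return total

# Noise-free check with phi(x) = x^2: both steps are then exact.
knots = [0.1, 0.5, 0.9]
phi_hat = [s ** 2 for s in knots]
value = taylor_lagrange_estimate(knots, phi_hat, s_star=0.5, v=-0.5, d=0, m=3)
slope = taylor_lagrange_estimate(knots, phi_hat, s_star=0.5, v=-0.5, d=1, m=2)
```

With noisy values $\widehat{\varphi}(s_k)$ in place of the exact ones, the same function returns the extrapolation estimate at $v = -0.5$, which lies outside the knot range.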



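The knots $s_k$ are chosen in order to minimize the variance of $\widehat{T}_{\varphi^{(d)},m,l}(v)$ and it holds

$$\lim_{m \to \infty}\ \lim_{l \to \infty}\ \lim_{\min_{k=0,\dots,l}(n_k) \to \infty} \widehat{T}_{\varphi^{(d)},m,l}(v) = \varphi^{(d)}(v), \qquad \forall v \in D.$$

$\widehat{T}_{\varphi^{(d)},m,l}(v)$ is an extrapolation estimator when $v \in U$ and an interpolation estimator when $v \in S$.

For a fixed degree $l$ of the Lagrange scheme (1), the total error committed while substituting $\varphi^{(d)}(v)$ by $\widehat{T}_{\varphi^{(d)},m,l}(v)$ writes

$$E_{Tot}(\varphi^{(d)})(v) := \varphi^{(d)}(v) - \widehat{T}_{\varphi^{(d)},m,l}(v).$$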

For the interpolation error concerning $\varphi^{(i+d)}$, we have the following result presented in [6], p. 293: if $\varphi^{(i+d)} \in C^{\alpha}(S)$, $\forall \alpha$, $l \ge 2\alpha - 3$, then

$$\sup_{s \in S}\left|\varphi^{(d+i)}(s) - L_l(\varphi^{(d+i)})(s)\right| \le M_1 := K(\alpha, l)\sup_{s \in S}\left|\varphi^{(d+i+\alpha)}(s)\right|.$$

This error depends on the very choice of the knots and is controlled through a tuning of $l$.

The error due to the Taylor expansion of order $(m - 1)$,

$$\varphi^{(d)}(v) - \sum_{i=0}^{m-1} \frac{(v - s^*)^i}{i!}\,\varphi^{(d+i)}(s^*),$$

depends on $s^*$; it is a truncation error and it can be controlled through a tuning of $m$.
Let $\widehat{\varphi}(s_k)$ be an estimate of $\varphi(s_k)$ on the knot $s_k$ and let

$$\varepsilon(k) := \widehat{\varphi}(s_k) - \varphi(s_k), \qquad k = 0, \dots, l,$$

denote the error pertaining to $\varphi(s_k)$ due to this estimation. $\varepsilon(k)$ clearly depends on $n_k$, the frequency of observations at knot $s_k$.

Finally, when $n$ is fixed, the error committed while extrapolating depends on the design $\{(n_k, s_k) \in (\mathbb{N} \setminus \{0\})^{l+1} \times \mathbb{R}^{l+1},\ k = 0, \dots, l,\ n = \sum_{k=0}^{l} n_k\}$.

Without loss of generality, we will assume $\sigma = 1$. In this case we have

$$\widehat{\varphi}(s_k) = \overline{Y}(s_k) := \frac{\sum_{j=1}^{n_k} Y_j(k)}{n_k}.$$

The general case when $\sigma$ is unknown is described in [3].

In the next Section we will provide upper bounds for the errors in order to control them.

Since $\varphi$ is supposed to be an analytic function, we can consider the extrapolation as an analytic continuation of the function out of the set $S$, obtained by a Taylor expansion from a suitably chosen point $s^*$ in $S$. So, the extrapolation error will depend on the order of the Taylor expansion and on the precision in the knowledge of the derivatives of the function at $s^*$. This precision is given by the interpolation error and by the estimation errors on the knots. The analyticity assumption also implies that the interpolation error will quickly converge to zero. Indeed, for every integer $r$, the following result holds:




We remark that the instability of the interpolation and extrapolation schemes discussed by Runge (1901) can be avoided if the chosen knots form a Tchebycheff set of points in $S$, or if they form a Fekete set of points in $S$, or by using splines.

Note that in all the works previously quoted the function is supposed to be polynomial with known degree (in [8] and [9]), to belong to a Sobolev space (see [12], [13], [14] and [15]), to be quasi-analytic (in [4] and [5]), or to be analytic (in [3]). Moreover, $\widetilde{S}$ is chosen as a Tchebycheff set of points in $S$.

Bernstein in [2] affirmed that polynomials of low degree are good approximations for analytic functions. In the case of the Broniatowski-Celant design ([3]), the double approximation used to approach $\varphi$ allows one to choose any subset of $S$ as a possible interpolation set. So, if the unknown function is supposed to be analytic, then we can choose a small interpolation set in order to obtain a small interpolation error.
3 Upper bounds and control of the error

The extrapolation error depends on three kinds of errors: truncation error, interpolation error and error of estimation of the function on the knots. In order to control the extrapolation error, we split an upper bound for it into a sum of three terms, each term depending on only one of the three kinds of errors.

In the sequel, we will distinguish two cases: in the first case, we suppose that the observed random variable $Y$ is bounded; in the second case, $Y$ is supposed to be a random variable with unbounded support. We suppose that the support is known.

3.1 Case 1: Y is a bounded random variable 



If $\tau_1, \tau_2$ (assumed known) are such that $\Pr(\tau_1 \le Y \le \tau_2) = 1$, it holds $|\varphi(v)| \le R$, where $R := \max\{|\tau_1|, |\tau_2|\}$. Indeed, $E(Y) = \varphi \in [-R, R]$. Let

$$\varepsilon(k) := \frac{\sum_{j=1}^{n_k} Y_j(k)}{n_k} - \varphi(s_k).$$

The variables $Y_j(k)$, $\forall j = 1, \dots, n_k$, $\forall k = 0, \dots, l$, are i.i.d., with the same bounded support, and for all $k$, $E(Y_j(k)) = \varphi(s_k)$; hence we can apply Hoeffding's inequality (in [7]):

$$\Pr\{|\varepsilon(k)| \ge \rho\} \le 2\exp\left(-\frac{2\rho^2 n_k}{(\tau_2 - \tau_1)^2}\right).$$
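To give a feeling for the size of this bound, the sketch below evaluates $2\exp(-2\rho^2 n_k/(\tau_2 - \tau_1)^2)$ and compares it with a small Monte Carlo experiment. The uniform distribution on $[0, 1]$, the seed and the numerical values are assumptions made for the illustration only.

```python
import math
import random

def hoeffding_bound(rho, n_k, tau1, tau2):
    """Upper bound 2 * exp(-2 rho^2 n_k / (tau2 - tau1)^2) on Pr{|eps(k)| >= rho}."""
    return 2.0 * math.exp(-2.0 * rho ** 2 * n_k / (tau2 - tau1) ** 2)

# Hypothetical setting: Y uniform on [0, 1], so tau1 = 0, tau2 = 1, E(Y) = 0.5.
random.seed(0)
n_k, rho, reps = 200, 0.1, 2000
exceed = 0
for _ in range(reps):
    mean = sum(random.random() for _ in range(n_k)) / n_k
    if abs(mean - 0.5) >= rho:
        exceed += 1

empirical = exceed / reps
bound = hoeffding_bound(rho, n_k, 0.0, 1.0)
print(empirical, bound)
```

The bound is distribution-free, so for a specific law such as the uniform one the empirical frequency is typically far below it.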



In Proposition 1, we give an upper bound for the extrapolation error denoted by $E_{ext}$. This bound is the sum of three terms: $M_{Taylor}$, controlling the error associated with the truncation of the Taylor expansion which defines $\varphi^{(d)}$; $M_{interp}$, controlling the interpolation error; and $M_{est}$, describing the estimation error on the knots.

Proposition 1 For all $\alpha \in \mathbb{N} \setminus \{0\}$, if $\varphi^{(i+d)} \in C^{\alpha}(a, b)$, $l \ge 2\alpha - 3$, then, $\forall u \in U$, $|E_{ext}(u)| \le M_{Taylor} + M_{interp} + M_{est}$, where

$$M_{Taylor} := \frac{(d + m)!}{m!}\left(\frac{s^* - u}{b - a}\right)^{m}\frac{R}{(b - a)^{d}},$$

$$K(l, \alpha) := \left(9 + \frac{4}{\pi}\ln(1 + l)\right)\left(\frac{\pi}{2(1 + l)}\right)^{\alpha},$$

$$M_{interp} := K(l, \alpha)\,\frac{R}{(\overline{s} - \underline{s})^{d+\alpha}}\sum_{i=0}^{m-1}\frac{(d + i + \alpha)!}{i!}\left(\frac{s^* - u}{\overline{s} - \underline{s}}\right)^{i},$$

$$A(l, m) := \sum_{i=0}^{m-1}\sum_{k=0}^{l}\left|\frac{(u - s^*)^{i}}{i!}\right|\left|L_{s_k}^{(d+i)}(s^*)\right|,$$

$$M_{est} := A(l, m)\left(\max_{k}|\varepsilon(k)|\right).$$



Proof. By using Cauchy's theorem on the derivatives of analytic functions, we obtain

$$\left|\varphi^{(d)}(u) - \sum_{i=0}^{m-1}\frac{\varphi^{(d+i)}(s^*)}{i!}(u - s^*)^{i}\right| \le \frac{\sup_{v}\left|\varphi^{(d+m)}(v)\right|}{m!}\,|u - s^*|^{m} \le \frac{R\,(m + d)!}{(b - a)^{d}\, m!}\left(\frac{s^* - u}{b - a}\right)^{m} = M_{Taylor}.$$

Hence

$$\begin{aligned}
|E_{ext}(u)| &= \left|\varphi^{(d)}(u) - \sum_{i=0}^{m-1}\frac{(u - s^*)^{i}}{i!}\, L_l(\widehat{\varphi}^{(d+i)})(s^*)\right| \\
&\le M_{Taylor} + \left|\sum_{i=0}^{m-1}\frac{(u - s^*)^{i}}{i!}\left(\varphi^{(d+i)}(s^*) - L_l(\varphi^{(d+i)})(s^*)\right)\right| + \left|\sum_{i=0}^{m-1}\frac{(u - s^*)^{i}}{i!}\left(L_l(\varphi^{(d+i)})(s^*) - L_l(\widehat{\varphi}^{(d+i)})(s^*)\right)\right| \\
&\le M_{Taylor} + \sum_{i=0}^{m-1}\frac{|s^* - u|^{i}}{i!}\, K(l, \alpha)\sup_{s \in S}\left|\varphi^{(d+i+\alpha)}(s)\right| + \sum_{i=0}^{m-1}\sum_{k=0}^{l}\left|\frac{(u - s^*)^{i}}{i!}\right|\left|L_{s_k}^{(d+i)}(s^*)\right|\left|\varphi(s_k) - \overline{Y}(k)\right| \\
&\le M_{Taylor} + M_{interp} + \sum_{i=0}^{m-1}\sum_{k=0}^{l}\left|\frac{(u - s^*)^{i}}{i!}\right|\left|L_{s_k}^{(d+i)}(s^*)\right|\left|\varphi(s_k) - \overline{Y}(k)\right| \\
&\le M_{Taylor} + M_{interp} + \max_{k}|\varepsilon(k)|\sum_{i=0}^{m-1}\sum_{k=0}^{l}\left|\frac{(u - s^*)^{i}}{i!}\right|\left|L_{s_k}^{(d+i)}(s^*)\right| \\
&= M_{Taylor} + M_{interp} + M_{est}.
\end{aligned}$$

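Proposition 2 yields the smallest integer such that the error of estimation is not greater than a chosen threshold with a fixed probability.

Proposition 2 $\forall \eta \in [0, 1]$, $\forall \rho \in \mathbb{R}^{+}$, $\exists n \in \mathbb{N}$ such that

$$\Pr\left(\max_{k=0,\dots,l}|\varepsilon(k)| \ge \frac{\rho}{A(l, m)}\right) \le \eta.$$

Proof. If, $\forall k$, $|\varepsilon(k)| \ge \frac{\rho}{A(l, m)}$, then $\max_{k=0,\dots,l}|\varepsilon(k)| \ge \frac{\rho}{A(l, m)}$. We have, by Hoeffding's inequality,

$$\prod_{k=0}^{l}\Pr\left\{|\varepsilon(k)| \ge \frac{\rho}{A(l, m)}\right\} \le 2^{l+1}\exp\left(-\frac{2\rho^2 n}{(A(l, m))^2(\tau_2 - \tau_1)^2}\right).$$

So, we can choose

$$n^{*} = \frac{(l + 1)\ln 2 - \ln\eta}{2}\left(\frac{A(l, m)\,(\tau_2 - \tau_1)}{\rho}\right)^{2}.$$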
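A numerical reading of this sample size may help. The sketch below assumes that the expression is $n^{*} = \frac{(l+1)\ln 2 - \ln\eta}{2}\left(\frac{A(l,m)(\tau_2 - \tau_1)}{\rho}\right)^2$, which is the value of $n$ at which $2^{l+1}\exp\left(-2\rho^2 n/((A(l,m))^2(\tau_2 - \tau_1)^2)\right)$ equals $\eta$, and it rounds up so that the exponential expression does not exceed $\eta$. The function names and all numerical inputs are hypothetical.

```python
import math

def sample_size(l, eta, rho, A, tau1, tau2):
    """Smallest integer n with 2^(l+1) * exp(-2 rho^2 n / (A (tau2 - tau1))^2) <= eta."""
    n_star = ((l + 1) * math.log(2.0) - math.log(eta)) / 2.0 \
        * (A * (tau2 - tau1) / rho) ** 2
    return math.ceil(n_star)

def product_bound(n, l, rho, A, tau1, tau2):
    """The exponential expression that n_star is designed to bring down to eta."""
    return 2.0 ** (l + 1) * math.exp(-2.0 * rho ** 2 * n / (A * (tau2 - tau1)) ** 2)

# Hypothetical inputs: 4 knots, A(l, m) = 2, support [0, 1], rho = 0.5, eta = 0.05.
n_needed = sample_size(l=3, eta=0.05, rho=0.5, A=2.0, tau1=0.0, tau2=1.0)
print(n_needed, product_bound(n_needed, 3, 0.5, 2.0, 0.0, 1.0))
```

Since $A(l, m)$ enters squared, a design with large derivatives $L_{s_k}^{(d+i)}(s^*)$ inflates the required number of observations quickly.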



Proposition 3 gives an upper bound for the extrapolation error that depends on $(l, m, n)$. We recall that the number of knots $l + 1$ controls the interpolation error, $m$ denotes the number of terms used in the Taylor expansion for $\varphi^{(d)}$ and $n$ is the total number of observations used to estimate $\varphi(s_k)$, $k = 0, \dots, l$. Hence $n$ controls the total estimation error.

Proposition 3 With the same hypotheses and notations, we have that

$$\forall (\rho_m, \rho_l, \rho_n) \in (\mathbb{R}^{+})^{3}, \qquad |E_{ext}(u)| \le \rho_m + \rho_l + \rho_n$$

with probability $\eta$. $\eta$ depends on the choice of $(\rho_m, \rho_l, \rho_n)$, which depends on $(m, l, n)$.



(A(l,m)f nk 
Proof. When $(\rho_m, \rho_l)$ is fixed, we can choose $(m, l)$ as the solution of the system:

$$(M_{Taylor}, M_{interp}) = (\rho_m, \rho_l).$$

We end the proof by taking p n = A ^ m ^ and n = n*. 
■ 

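In the case of the estimation of $f(u)$ (i.e., when $d = 0$) we obtain for the couple $(m, n)$ the explicit solution

$$m = \frac{\ln\rho_m - \ln R}{\ln(s^* - u) - \ln(b - a)},$$

$$n = \frac{(l + 1)\ln 2 - \ln\eta}{2}\left(\frac{A(l)\,(\tau_2 - \tau_1)}{\rho_n}\right)^{2},$$

$$A(l) := \sum_{i=0}^{m-1}\sum_{k=0}^{l}\left|\frac{(s^* - u)^{i}}{i!}\right|\left|L_{s_k}^{(i)}(s^*)\right|.$$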
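The expression for $m$ can be read as the solution of $R\left(\frac{s^* - u}{b - a}\right)^{m} = \rho_m$, which is what the Taylor term reduces to when $d = 0$. The sketch below assumes this reading and rounds up to the next integer so that the Taylor term does not exceed $\rho_m$; the function name `taylor_order` and the numerical values are hypothetical.

```python
import math

def taylor_order(rho_m, R, s_star, u, a, b):
    """Smallest integer m with R * ((s_star - u) / (b - a))**m <= rho_m (case d = 0)."""
    ratio = (s_star - u) / (b - a)
    if not 0.0 < ratio < 1.0:
        raise ValueError("requires 0 < (s_star - u) / (b - a) < 1")
    m_real = (math.log(rho_m) - math.log(R)) / (math.log(s_star - u) - math.log(b - a))
    return math.ceil(m_real)

# Hypothetical setting: D = (0, 2), s_star = 1, extrapolation point u = 0.5, R = 1.
m_needed = taylor_order(rho_m=1e-3, R=1.0, s_star=1.0, u=0.5, a=0.0, b=2.0)
taylor_term = 1.0 * ((1.0 - 0.5) / (2.0 - 0.0)) ** m_needed
print(m_needed, taylor_term)
```

The closer $u$ lies to the boundary of $D$ relative to $s^*$, the closer the ratio is to 1 and the larger the order $m$ that is needed.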



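When $l \ge 2\alpha - 3$, $l$ is the solution of the equation

$$\rho_l = \left(9 + \frac{4}{\pi}\ln(1 + l)\right)\left(\frac{\pi}{2(1 + l)}\right)^{\alpha}\frac{R}{(\overline{s} - \underline{s})^{\alpha}}\sum_{i=0}^{m-1}\left(\frac{s^* - u}{\overline{s} - \underline{s}}\right)^{i}\frac{(i + \alpha)!}{i!}.$$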
Theorem 4, due to Markoff, provides a uniform bound for the derivatives of a Lagrange polynomial.

Theorem 4 (Markoff) Let $P_l(s) := \sum_{j} a_j s^{j}$ be a polynomial with real coefficients and degree $l$. If $\sup_{s \in S}|P_l(s)| \le W$, then for all $s$ in $\mathrm{int}\, S$ and for all $j$ in $\mathbb{N}$, it holds

$$\left|P_l^{(j)}(s)\right| \le \frac{l^2(l^2 - 1)\cdots(l^2 - (j - 1)^2)}{(2j - 1)!!}\left(\frac{2}{\overline{s} - \underline{s}}\right)^{j} W.$$
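The bound can be checked on the Tchebycheff polynomial $T_l$ on $[-1, 1]$, for which $W = 1$ and the bound is known to be attained at the endpoint $s = 1$. The sketch below is an illustration only; the function name `markov_bound`, the degree and the grid are hypothetical choices.

```python
import numpy as np
from numpy.polynomial import Chebyshev

def markov_bound(l, j, s_low, s_high, W):
    """l^2 (l^2 - 1) ... (l^2 - (j-1)^2) / (2j - 1)!! * (2 / (s_high - s_low))^j * W."""
    numerator = np.prod([l ** 2 - i ** 2 for i in range(j)], dtype=float)
    double_factorial = np.prod(np.arange(1, 2 * j, 2), dtype=float)  # 1 * 3 * ... * (2j - 1)
    return numerator / double_factorial * (2.0 / (s_high - s_low)) ** j * W

l = 5
grid = np.linspace(-1.0, 1.0, 2001)
T = Chebyshev.basis(l)                     # |T_l| <= 1 on [-1, 1], so W = 1

results = {}
for j in (1, 2):
    observed = np.abs(T.deriv(j)(grid)).max()
    results[j] = (observed, markov_bound(l, j, -1.0, 1.0, 1.0))
print(results)
```

For $l = 5$ the bounds are $25$ for the first derivative and $200$ for the second one, and the observed maxima on the grid reach these values at $s = \pm 1$.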



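When applied to the elementary Lagrange polynomial, it is readily checked that $W = \pi$. Indeed, with $\theta \in [0, \pi]$ the angle associated with $s$ through the cosine parametrization of the knots,

$$\begin{aligned}
\left|L_{s_k}(s)\right| &= \left|\frac{(-1)^{k}\sin\left(\frac{2k-1}{2l+2}\pi\right)}{l + 1}\cdot\frac{\cos((l + 1)\theta)}{\cos\theta - \cos\left(\frac{2k-1}{2l+2}\pi\right)}\right| \\
&\le \frac{\left|\sin\left(\frac{2k-1}{2l+2}\pi\right)\right|}{l + 1}\cdot\frac{\left|\cos((l + 1)\theta)\right|}{\left|\cos\theta - \cos\left(\frac{2k-1}{2l+2}\pi\right)\right|} \\
&\le \frac{\left|\sin\left(\frac{2k-1}{2l+2}\pi\right)\right|}{l + 1}\cdot\frac{(l + 1)\left|\theta - \frac{2k-1}{2l+2}\pi\right|}{\left|\cos\theta - \cos\left(\frac{2k-1}{2l+2}\pi\right)\right|} \\
&\le \pi.
\end{aligned}$$

We used

$$\left|\cos((l + 1)\theta)\right| = \left|\cos((l + 1)\theta) - \cos\left((l + 1)\frac{2k-1}{2l+2}\pi\right)\right| \le (l + 1)\left|\theta - \frac{2k-1}{2l+2}\pi\right|$$

and $\cos\left((l + 1)\frac{2k-1}{2l+2}\pi\right) = 0$. Moreover,

$$\left|\cos\theta - \cos\left(\frac{2k-1}{2l+2}\pi\right)\right| = 2\left|\sin\left(\frac{\theta + \frac{2k-1}{2l+2}\pi}{2}\right)\right|\left|\sin\left(\frac{\theta - \frac{2k-1}{2l+2}\pi}{2}\right)\right|.$$

The concavity of the sine function on $[0, \pi]$ implies

$$\sin\left(\frac{\theta + \frac{2k-1}{2l+2}\pi}{2}\right) \ge \frac{\sin\theta + \sin\left(\frac{2k-1}{2l+2}\pi\right)}{2} \ge \frac{1}{2}\sin\left(\frac{2k-1}{2l+2}\pi\right)$$

and

$$\left|\sin\left(\frac{\theta - \frac{2k-1}{2l+2}\pi}{2}\right)\right| \ge \frac{1}{\pi}\left|\theta - \frac{2k-1}{2l+2}\pi\right|, \qquad \theta \in [0, \pi].$$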
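The value $W = \pi$ is an upper bound rather than the exact maximum, which a direct computation makes visible. The sketch below evaluates $|L_{s_k}(s)|$ on a grid of $[-1, 1]$ for Tchebycheff knots and reports the largest value found; the index shift $j = 1, \dots, l+1$, the degree and the grid are assumptions made for the illustration.

```python
import numpy as np

l = 8
j = np.arange(1, l + 2)                            # shifted index (assumption)
knots = np.cos((2 * j - 1) * np.pi / (2 * l + 2))  # Tchebycheff knots on [-1, 1]
grid = np.linspace(-1.0, 1.0, 4001)

def basis_on_grid(knots, k, grid):
    """Values of the elementary Lagrange polynomial L_{s_k} on the grid."""
    others = np.delete(knots, k)
    return np.prod((grid[:, None] - others) / (knots[k] - others), axis=1)

largest = max(np.abs(basis_on_grid(knots, k, grid)).max() for k in range(l + 1))
print(largest, np.pi)
```

In this experiment the largest value stays well below $\pi$, so using $W = \pi$ in Theorem 4 is safe but not tight.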



Remark 5 The Cauchy theorem merely gives a rough upper bound. In 
order to obtain a sharper upper bound, we would assume some additional 
hypotheses on the derivatives of the function. 

3.2 Case 2: Y is an unbounded random variable 

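If the support of the random variable $Y$ is not bounded and $\varphi$ is a polynomial of unknown degree $t$, $t \le g - 1$, with $g$ known, it is still possible to give an upper bound for the estimation error. Since

$$\varphi^{(d)}(u) = \sum_{i=0}^{g-1}\frac{\varphi^{(d+i)}(s^*)}{i!}(u - s^*)^{i} = \sum_{k=0}^{g-1} L_{s_k}^{(d)}(u)\,\varphi(s_k),$$

$\varphi^{(d)}$ can be estimated as follows

$$\widehat{\varphi}^{(d)}(u) := \sum_{k=0}^{g-1} L_{s_k}^{(d)}(u)\,\overline{Y}(k).$$

We have in probability $\widehat{\varphi}^{(d)} \to \varphi^{(d)}$ for $\min(n_k) \to \infty$. So,

$$Var\left(\widehat{\varphi}^{(d)}(u)\right) = \sum_{k=0}^{g-1}\left(L_{s_k}^{(d)}(u)\right)^{2}\frac{\varsigma}{n_k} \to 0,$$

where $\varsigma$ is the variance of $Z$. We use Tchebycheff's inequality in order to obtain an upper bound for the estimation error. For a given $\eta$,

$$\Pr\left\{\left|\widehat{\varphi}^{(d)}(u) - \varphi^{(d)}(u)\right| \ge \eta\right\} \le \frac{Var\left(\widehat{\varphi}^{(d)}(u)\right)}{\eta^{2}}.$$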
If we aim to obtain, for all fixed $u$, $\Pr\left\{\left|\widehat{\varphi}^{(d)} - \varphi^{(d)}\right| \ge \eta\right\} \le \omega$, we can choose $n^{*}$ as the solution of the equation

$$\sum_{k=0}^{g-1}\left(L_{s_k}^{(d)}(u)\right)^{2}\frac{\varsigma}{n_k} = \omega\eta^{2}.$$

The integer $[n^{*}]$ is such that the inequality $\Pr\left\{\left|\widehat{\varphi}^{(d)} - \varphi^{(d)}\right| \ge \eta\right\} \le \omega$ is satisfied.
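The role of the frequencies in this bound can be illustrated as follows. The sketch computes the variance $\sum_k (L_{s_k}^{(d)}(u))^2 \varsigma / n_k$, the resulting Tchebycheff bound, and a common per-knot frequency under an equal-allocation assumption $n_k = n/g$. That allocation, the function names and the numerical values are hypothetical and are not taken from the paper; the weights used are the values $L_{s_k}(u)$ for the knots $0, 0.5, 1$ at $u = -0.5$.

```python
import math

def variance_of_estimate(weights, freqs, varsigma):
    """sum_k w_k^2 * varsigma / n_k, with w_k standing for L_{s_k}^{(d)}(u)."""
    return sum(w ** 2 * varsigma / n_k for w, n_k in zip(weights, freqs))

def tchebycheff_bound(variance, eta):
    """Upper bound Var / eta^2 on Pr{|estimate - target| >= eta}, capped at 1."""
    return min(1.0, variance / eta ** 2)

def per_knot_frequency(weights, varsigma, eta, omega):
    """Common frequency n_k that brings the bound down to omega (equal allocation)."""
    return math.ceil(varsigma * sum(w ** 2 for w in weights) / (omega * eta ** 2))

weights = [3.0, -3.0, 1.0]        # L_{s_k}(-0.5) for knots 0, 0.5, 1
n_k = per_knot_frequency(weights, varsigma=1.0, eta=0.5, omega=0.05)
bound = tchebycheff_bound(variance_of_estimate(weights, [n_k] * 3, 1.0), eta=0.5)
print(n_k, bound)
```

The large weights reflect the amplification of the noise by extrapolation: the further $u$ lies from the knots, the larger the $L_{s_k}^{(d)}(u)$ and the more observations are needed.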

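We remark that if we know the degree $t$ of the polynomial, then it is sufficient to set $g - 1 = t$. When $\varphi^{(d)}(u) = \varphi(u)$ (i.e., $d = 0$), we have

$$\widehat{\varphi}(u) = \sum_{k=0}^{g-1} L_{s_k}(u)\,\overline{Y}(k).$$

We underline that for $d = 0$ and when $t$ is known, $\widehat{\varphi}^{(d)}(u) = \widehat{\varphi}(u)$ coincides with Hoel's estimator.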

If the sole information on $\varphi$ is that $\varphi$ is analytic, then we are constrained to make hypotheses on the derivatives of the function. More precisely, since $\mathrm{Im}(\varphi) \subset \mathbb{R}$, we cannot apply the Cauchy theorem on analytic functions; we can only say that $\varphi(v) = E(Y) \in \mathbb{R}$. So, we are not able to find a constant $R$ such that $|\varphi(v)| \le R$. Moreover, since we cannot observe $\varphi(v)$ for $v \notin S$, we do not have any data to estimate $M_{Taylor}$.